 Akwa Ibom State


Ibom NLP: A Step Toward Inclusive Natural Language Processing for Nigeria's Minority Languages

Kalejaiye, Oluwadara, Beyene, Luel Hagos, Adelani, David Ifeoluwa, Edet, Mmekut-Mfon Gabriel, Akpan, Aniefon Daniel, Urua, Eno-Abasi, Andy, Anietie

arXiv.org Artificial Intelligence

Nigeria is the most populous country in Africa with a population of more than 200 million people. More than 500 languages are spoken in Nigeria and it is one of the most linguistically diverse countries in the world. Despite this, natural language processing (NLP) research has mostly focused on the following four languages: Hausa, Igbo, Nigerian-Pidgin, and Yoruba (i.e., <1% of the languages spoken in Nigeria). This is in part due to the unavailability of textual data in these languages to train and apply NLP algorithms. In this work, we introduce ibom -- a dataset for machine translation and topic classification in four Coastal Nigerian languages from the Akwa Ibom State region: Anaang, Efik, Ibibio, and Oro. These languages are not represented in Google Translate or in major benchmarks such as Flores-200 or SIB-200. We focus on extending the Flores-200 benchmark to these languages, and further align the translated texts with topic labels based on the SIB-200 classification dataset. Our evaluation shows that current LLMs perform poorly on machine translation for these languages in both zero-shot and few-shot settings. However, we find that topic classification improves steadily as more few-shot samples are provided.


A Real-Time Framework for Intermediate Map Construction and Kinematically Feasible Off-Road Planning Without OSM

Jerome, Otobong, Kulathunga, Geesara Prathap, Dmitry, Devitt, Murawjow, Eugene, Klimchik, Alexandr

arXiv.org Artificial Intelligence

Off-road environments present unique challenges for autonomous navigation due to their complex and unstructured nature. Traditional global path-planning methods, which typically aim to minimize path length and travel time, perform poorly on large-scale maps and fail to account for critical factors such as real-time performance, kinematic feasibility, and memory efficiency. This paper introduces a novel global path-planning method specifically designed for off-road environments, addressing these essential factors. The method begins by constructing an intermediate map within the pixel coordinate system, incorporating geographical features like off-road trails, waterways, restricted and passable areas, and trees. The planning problem is then divided into three sub-problems: graph-based path planning, kinematic feasibility checking, and path smoothing. This approach effectively meets real-time performance requirements while ensuring kinematic feasibility and efficient memory use. The method was tested in various off-road environments with large-scale maps up to several square kilometers in size, successfully identifying feasible paths in an average of 1.5 seconds and utilizing approximately 1.5GB of memory under extreme conditions. The proposed framework is versatile and applicable to a wide range of off-road autonomous navigation tasks, including search and rescue missions and agricultural operations.
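The three sub-problems named in the abstract (graph-based planning, kinematic feasibility checking, and path smoothing) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' method: the map is a small occupancy grid in pixel coordinates, the planner is plain A*, feasibility is approximated by a maximum heading change between consecutive segments, and smoothing is a three-point moving average. The function names and the 60-degree limit are hypothetical.

```python
import heapq
import math

# 8-connected neighbourhood on a pixel grid (row, column offsets).
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1), (-1, -1), (-1, 1), (1, -1), (1, 1)]


def plan_on_grid(grid, start, goal):
    """A* search; grid[r][c] == 0 marks a passable pixel. Returns a list of cells or None."""
    rows, cols = len(grid), len(grid[0])
    open_heap = [(0.0, start)]
    cost = {start: 0.0}
    parent = {start: None}
    while open_heap:
        _, cur = heapq.heappop(open_heap)
        if cur == goal:
            path = []
            while cur is not None:
                path.append(cur)
                cur = parent[cur]
            return path[::-1]
        for dr, dc in MOVES:
            nxt = (cur[0] + dr, cur[1] + dc)
            if not (0 <= nxt[0] < rows and 0 <= nxt[1] < cols):
                continue
            if grid[nxt[0]][nxt[1]] != 0:
                continue
            new_cost = cost[cur] + math.hypot(dr, dc)
            if new_cost < cost.get(nxt, math.inf):
                cost[nxt] = new_cost
                parent[nxt] = cur
                # Euclidean distance to the goal as the heuristic.
                h = math.hypot(goal[0] - nxt[0], goal[1] - nxt[1])
                heapq.heappush(open_heap, (new_cost + h, nxt))
    return None


def max_turn_deg(path):
    """Largest heading change, in degrees, between consecutive path segments."""
    worst = 0.0
    for a, b, c in zip(path, path[1:], path[2:]):
        h1 = math.atan2(b[0] - a[0], b[1] - a[1])
        h2 = math.atan2(c[0] - b[0], c[1] - b[1])
        d = abs(h2 - h1)
        d = min(d, 2 * math.pi - d)  # wrap the difference into [0, pi]
        worst = max(worst, math.degrees(d))
    return worst


def is_feasible(path, max_turn=60.0):
    """Crude stand-in for a kinematic check: reject paths with sharp turns."""
    return max_turn_deg(path) <= max_turn


def smooth(path):
    """Three-point moving average with fixed endpoints (does not re-check collisions)."""
    if len(path) < 3:
        return [(float(r), float(c)) for r, c in path]
    out = [(float(path[0][0]), float(path[0][1]))]
    for a, b, c in zip(path, path[1:], path[2:]):
        out.append(((a[0] + b[0] + c[0]) / 3.0, (a[1] + b[1] + c[1]) / 3.0))
    out.append((float(path[-1][0]), float(path[-1][1])))
    return out
```

In a fuller pipeline the smoothed path would be checked again against the map, and an infeasible path would trigger replanning. The sketch leaves both out for brevity.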


Class-Level Feature Selection Method Using Feature Weighted Growing Self-Organising Maps

Starkey, Andrew, Akpan, Uduak Idio, Hosni, Omaimah AL, Pullissery, Yaseen

arXiv.org Artificial Intelligence

There have been several attempts to develop Feature Selection (FS) algorithms capable of identifying features that are relevant in a dataset. Although in certain applications the FS algorithms can be seen to be successful, they have similar basic limitations. In all cases, the global feature selection algorithms seek to select features that are relevant and common to all classes of the dataset. This is a major limitation since there could be features that are specifically useful for a particular class while irrelevant for other classes, and full explanation of the relationship at class level therefore cannot be determined. While the inclusion of such features for all classes could cause improved predictive ability for the relevant class, the same features could be problematic for other classes. In this paper, we examine this issue and also develop a class-level feature selection method called the Feature Weighted Growing Self-Organising Map (FWGSOM). The proposed method carries out feature analysis at class level which enhances its ability to identify relevant features for each class. Results from experiments indicate that our method performs better than other methods, gives explainable results at class level, and has a low computational footprint when compared to other methods.


Mitigating Translationese in Low-resource Languages: The Storyboard Approach

Kuwanto, Garry, Urua, Eno-Abasi E., Amuok, Priscilla Amondi, Muhammad, Shamsuddeen Hassan, Aremu, Anuoluwapo, Otiende, Verrah, Nanyanga, Loice Emma, Nyoike, Teresiah W., Akpan, Aniefon D., Udouboh, Nsima Ab, Archibong, Idongesit Udeme, Moses, Idara Effiong, Ige, Ifeoluwatayo A., Ajibade, Benjamin, Awokoya, Olumide Benjamin, Abdulmumin, Idris, Aliyu, Saminu Mohammad, Iro, Ruqayya Nasir, Ahmad, Ibrahim Said, Smith, Deontae, Michaels, Praise-EL, Adelani, David Ifeoluwa, Wijaya, Derry Tanti, Andy, Anietie

arXiv.org Artificial Intelligence

Low-resource languages often face challenges in acquiring high-quality language data due to the reliance on translation-based methods, which can introduce the translationese effect. This phenomenon results in translated sentences that lack fluency and naturalness in the target language. In this paper, we propose a novel approach for data collection by leveraging storyboards to elicit more fluent and natural sentences. Our method involves presenting native speakers with visual stimuli in the form of storyboards and collecting their descriptions without direct exposure to the source text. We conducted a comprehensive evaluation comparing our storyboard-based approach with traditional text translation-based methods in terms of accuracy and fluency. Human annotators and quantitative metrics were used to assess translation quality. The results indicate a preference for text translation in terms of accuracy, while our method shows worse accuracy but better fluency in the languages studied.


Difference of Probability and Information Entropy for Skills Classification and Prediction in Student Learning

Ehimwenma, Kennedy Efosa, Sharji, Safiya Al, Raheem, Maruf

arXiv.org Artificial Intelligence

The probability of an event lies in the range [0, 1]. In a sample space S, the value of probability determines whether an outcome is true or false. The probability Pr(A) of an event A that will never occur is 0. The probability Pr(B) of an event B that will certainly occur is 1. Both events A and B are thus certainties. Furthermore, the sum of probabilities Pr(E1) + Pr(E2) + ... + Pr(En) of a finite set of events in a given sample space S is 1. Conversely, the difference of the sum of two probabilities that will certainly occur is 0. Firstly, this paper discusses Bayes' theorem, then the complement of probability and the difference of probability for occurrences of learning events, before applying these in the prediction of learning objects in student learning. Given the sum total of 1, to make recommendations for student learning, this paper submits that the difference of argMaxPr(S) and the probability of student performance quantifies the weight of learning objects for students. Using a skill-set dataset, the computational procedure demonstrates: i) the probability of skill-set events that have occurred and would lead to higher-level learning; ii) the probability of the events that have not occurred and require subject-matter relearning; iii) the accuracy of a decision tree in the prediction of student performance into class labels; and iv) information entropy about skill-set data and its implication for student cognitive performance and recommendation of learning [1].
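Two of the quantities in this abstract can be made concrete with a short sketch. It rests on assumptions not stated in the abstract: the maximum probability is taken to be 1, the "weight" of a learning object is read as that maximum minus the student's performance probability, and entropy is the standard Shannon entropy over class labels. The function names are hypothetical.

```python
import math
from collections import Counter


def relearning_weight(p_performance, p_max=1.0):
    """Difference between the maximum probability and a student's performance probability."""
    if not 0.0 <= p_performance <= p_max:
        raise ValueError("performance probability must lie in [0, p_max]")
    return p_max - p_performance


def shannon_entropy(labels):
    """Shannon entropy in bits of the empirical label distribution."""
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

Under this reading, a student with a performance probability of 0.75 on a skill gets a weight of 0.25 for the corresponding learning object. A label set split evenly between two classes has an entropy of 1 bit, and a set with a single class has an entropy of 0.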


Technical Opinion: From Animal Behaviour to Autonomous Robots

Ezenkwu, Chinedu Pascal, Starkey, Andrew

arXiv.org Artificial Intelligence

As the scope for robotic applications extends from structured to unstructured and more complex environments, autonomy has become a desideratum for most of today's robots. The practice of handcrafting robots does not give them the capability to cope with unforeseen situations. Although several research contributions have been made towards robot autonomy, we are nowhere near the level of autonomy that is exhibited by animals, even ones at the lowest biological level of organisation. This is because animals are born with innate capabilities, both in their body structure and intelligence, to survive and develop in their milieus; their behaviours and sometimes their morphological traits can evolve to adapt to persistent changes in their habitats. For example, Corcoran et al. [1] studied the co-evolutionary battle between the bat and the moth.


Data-driven Air Quality Characterisation for Urban Environments: a Case Study

Zhou, Yuchao, De, Suparna, Ewa, Gideon, Perera, Charith, Moessner, Klaus

arXiv.org Machine Learning

The economic and social impact of poor air quality in towns and cities is increasingly being recognised, together with the need for effective ways of creating awareness of real-time air quality levels and their impact on human health. Because monitoring stations maintained by local authorities are geographically sparse and the resulting datasets also have missing labels, computational data-driven mechanisms are needed to address the data sparsity challenge. In this paper, we propose a machine learning-based method to accurately predict the Air Quality Index (AQI), using environmental monitoring data together with meteorological measurements. To do so, we develop an air quality estimation framework that implements a neural network that is enhanced with a novel Non-linear Autoregressive neural network with exogenous input (NARX), especially designed for time series prediction. The framework is applied to a case study featuring different monitoring sites in London, with comparisons against other standard machine learning-based predictive algorithms showing the feasibility and robust performance of the proposed method for different kinds of areas within an urban region.
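A NARX model predicts the current value of a series from its own past values and from past values of an exogenous input. The sketch below shows only how such lagged training rows might be assembled, assuming one target series (for example AQI) and one exogenous series (for example a meteorological measurement). The function name and lag parameters are hypothetical, and the paper's network architecture is not reproduced.

```python
def make_narx_rows(y, x, y_lags, x_lags):
    """Build (features, targets) where each feature row is
    [y[t-1], ..., y[t-y_lags], x[t-1], ..., x[t-x_lags]] and the target is y[t]."""
    if len(y) != len(x):
        raise ValueError("target and exogenous series must have equal length")
    start = max(y_lags, x_lags)  # first index with a full lag history
    features, targets = [], []
    for t in range(start, len(y)):
        past_y = [y[t - k] for k in range(1, y_lags + 1)]
        past_x = [x[t - k] for k in range(1, x_lags + 1)]
        features.append(past_y + past_x)
        targets.append(y[t])
    return features, targets
```

The resulting rows can be passed to any regressor. A neural network fitted on them gives the non-linear autoregressive form that the acronym describes.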


Tunde Adegbola - Wikipedia

#artificialintelligence

Tunde Adegbola, born 1 August 1955, also known as T. A. or Uncle T, is a scientist, musician, engineer, linguist and culture activist. He is best known for his work in setting up most of the pioneering private television and radio stations in Nigeria. He is the founder of TIWA systems, and the Executive Director of Alt-i (African Languages Technology Initiative). Tunde completed a bachelor's degree in Electrical Engineering at the University of Lagos, and later specialized in broadcast technology. He subsequently obtained a master's degree in Computer Science from the University of Wales (Swansea).


Russia launches facial recognition programme to find anyone's face on Twitter

The Independent - Tech

A Russian company has launched a programme that can identify a stranger among 300 million Twitter users in less than a second. The social media platform has responded to the new software, called "FindFace", saying its use is in "violation" of its rules and it is taking the matter "very seriously". "We see lots of opportunities for Twitter users on the service," Artem Kukharenko, co-founder of NTechLab, told BuzzFeed. "We think this is something many people will use," he added, claiming the technology could be used to reduce spam profiles. "Not in the US, but in other countries there is a real problem of politicians, reporters, finding that someone created a fake account for them. "I was involved back in Russia with scandals with a fake account posing as a politician that tweeted something and created political scandal," he said. Christopher Weatherhead, Technologist at Privacy International said: "The software created by NTechLab highlights the ease to which cross-referencing profiles photos is possible.